Add low-latency raw memory search by strongkeep-debug · Pull Request #173 · XortexAI/XMem

strongkeep-debug · 2026-05-11T18:21:08Z

Addresses #163.

This PR turns memory search into a true low-latency path. Raw search now goes through RetrievalPipeline.search_raw and returns ranked profile, temporal, summary, snippet, and code annotation hits without retrieval-plan tool selection. answer=true synthesizes from those already-fetched hits when a caller wants a generated answer, and the root /search alias is wired for clients that need the shorter path.

Area	Change	Evidence
Raw retrieval	Selected domains are searched directly and ranked by score, including the code annotation domain requested by the issue.	`test_raw_search_returns_ranked_hits_without_tool_selection` confirms no tool-selection call is made and verifies a code hit keeps file and symbol metadata.
API contract	`/v1/memory/search` accepts `code` in the domain list and includes it in the default raw search domain set.	`test_memory_search_route_accepts_code_domain` covers the request validator and serialized response shape.
Optional answer	`answer=true` synthesizes from collected raw hits without doing agentic retrieval planning first.	`test_root_search_alias_can_synthesize_answer` covers the alias and answer mode.
Caching and latency	Profile catalogs and retrieval plans are cached for the agentic path, and bounded p50/p95/p99 latency snapshots are recorded for raw, answer, and agentic modes.	`test_retrieval_pipeline_caches_catalog_and_retrieval_plan` covers cache reuse.
Robustness	Raw search now normalizes missing/non-finite backend scores and keeps healthy domain results when another requested domain fails.	`test_raw_search_skips_failed_domains_and_normalizes_scores` and the API route regression cover both pipeline and serialization behavior.
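The bounded p50/p95/p99 snapshots mentioned in the table could be kept roughly as follows. This is a minimal sketch, not the PR's implementation: the `LatencyTracker` class, its method names, the window size of 512, and the nearest-rank percentile method are all assumptions made for illustration.

```python
import math
from collections import deque
from typing import Deque, Dict, List


class LatencyTracker:
    """Keeps a bounded window of latency samples per search mode."""

    def __init__(self, max_samples: int = 512) -> None:
        # 512 is an assumed bound; the PR does not state the real limit.
        self._max_samples = max_samples
        self._samples: Dict[str, Deque[float]] = {}

    def record(self, mode: str, elapsed_ms: float) -> None:
        # deque(maxlen=...) drops the oldest sample once the window is full.
        window = self._samples.setdefault(mode, deque(maxlen=self._max_samples))
        window.append(elapsed_ms)

    @staticmethod
    def _percentile(ordered: List[float], pct: int) -> float:
        # Nearest-rank percentile over an already sorted, non-empty list.
        rank = max(1, math.ceil(pct * len(ordered) / 100))
        return ordered[rank - 1]

    def snapshot(self) -> Dict[str, Dict[str, float]]:
        result: Dict[str, Dict[str, float]] = {}
        for mode, window in self._samples.items():
            ordered = sorted(window)
            result[mode] = {
                "count": len(ordered),
                "p50": self._percentile(ordered, 50),
                "p95": self._percentile(ordered, 95),
                "p99": self._percentile(ordered, 99),
            }
        return result
```

Because each mode has its own fixed-size window, memory stays constant regardless of traffic, at the cost of percentiles that only reflect recent requests.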

Validation was run locally:

.\.venv\Scripts\python.exe -m pytest -q
50 passed

.\.venv\Scripts\python.exe -m pytest tests\integration\test_retrieval_pipeline.py tests\api\test_memory_search_routes.py -q
12 passed

.\.venv\Scripts\python.exe -m ruff check src\pipelines\retrieval.py src\api\routes\memory.py tests\integration\test_retrieval_pipeline.py tests\api\test_memory_search_routes.py
All checks passed!

git diff --check
passed

gemini-code-assist

Code Review

This pull request introduces a low-latency raw search endpoint and enhances the existing search functionality with optional answer synthesis and latency tracking. Key changes include the implementation of search_raw and answer_from_sources in the retrieval pipeline, the addition of TTL-based caching for profile catalogs and retrieval plans, and the inclusion of detailed latency metrics in search responses. Feedback focuses on optimizing performance by parallelizing domain searches, ensuring the profile catalog retrieval is asynchronous to avoid blocking the event loop, and managing cache memory usage through bounded collections and hashed keys.

gemini-code-assist · 2026-05-11T18:26:41Z

+        if "profile" in domain_set:
+            results.extend(await self._search_profile_raw(query, user_id, top_k))
+        if "temporal" in domain_set:
+            results.extend(await self._search_temporal(query, user_id, top_k))
+        if "summary" in domain_set:
+            results.extend(await self._search_summary(query, user_id, top_k))
+        if "snippet" in domain_set:
+            results.extend(await self._search_snippet(query, user_id, top_k))


The current implementation of search_raw executes searches across different domains sequentially. To achieve true low-latency as intended by this PR, these searches should be executed in parallel using asyncio.gather.

Suggested change

-        if "profile" in domain_set:
-            results.extend(await self._search_profile_raw(query, user_id, top_k))
-        if "temporal" in domain_set:
-            results.extend(await self._search_temporal(query, user_id, top_k))
-        if "summary" in domain_set:
-            results.extend(await self._search_summary(query, user_id, top_k))
-        if "snippet" in domain_set:
-            results.extend(await self._search_snippet(query, user_id, top_k))
+        domain_set = set(domains)
+        tasks = []
+        if "profile" in domain_set:
+            tasks.append(self._search_profile_raw(query, user_id, top_k))
+        if "temporal" in domain_set:
+            tasks.append(self._search_temporal(query, user_id, top_k))
+        if "summary" in domain_set:
+            tasks.append(self._search_summary(query, user_id, top_k))
+        if "snippet" in domain_set:
+            tasks.append(self._search_snippet(query, user_id, top_k))
+        task_results = await asyncio.gather(*tasks)
+        results: List[SourceRecord] = [item for sublist in task_results for item in sublist]

Resolved in the current head: search_raw now builds per-domain tasks and awaits asyncio.gather(*tasks, return_exceptions=True), so the raw path runs requested domain searches concurrently while preserving healthy domain results if one domain fails. The same concurrent path now includes the code domain as well.
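For readers following the thread, the failure-tolerant fan-out described in this reply can be sketched as below. The function `search_domains`, the dict-based hits, and the three fake searchers are hypothetical stand-ins, not the actual `RetrievalPipeline` code; only `asyncio.gather(..., return_exceptions=True)` is taken from the reply.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

Searcher = Callable[[], Awaitable[List[dict]]]


async def search_domains(searchers: Dict[str, Searcher], domains: List[str]) -> List[dict]:
    """Run the requested domain searches concurrently and keep healthy results."""
    selected = [name for name in domains if name in searchers]
    outcomes = await asyncio.gather(
        *(searchers[name]() for name in selected),
        return_exceptions=True,  # a failing domain yields its exception as a value
    )
    hits: List[dict] = []
    for name, outcome in zip(selected, outcomes):
        if isinstance(outcome, BaseException):
            # A real pipeline would log the failure here; this sketch just skips it.
            continue
        hits.extend(outcome)
    return sorted(hits, key=lambda hit: hit["score"], reverse=True)


# Hypothetical searchers used only to exercise the sketch.
async def fake_profile() -> List[dict]:
    return [{"domain": "profile", "score": 0.4}]


async def fake_temporal() -> List[dict]:
    raise RuntimeError("backend down")


async def fake_code() -> List[dict]:
    return [{"domain": "code", "score": 0.9}]
```

With `return_exceptions=True`, one failing backend no longer cancels or discards the results from the others, which is the behavior the reply describes.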

gemini-code-assist · 2026-05-11T18:26:41Z

+    def _get_profile_catalog(self, user_id: str):
+        cached = self._profile_catalog_cache.get(user_id)
+        now = time.monotonic()
+        if cached and cached[0] > now:
+            return cached[1], cached[2]
+
+        catalog, results = self._fetch_profile_catalog(user_id)
+        self._profile_catalog_cache[user_id] = (
+            now + _CACHE_TTL_SECONDS,
+            catalog,
+            results,
+        )
+        return catalog, results


The _get_profile_catalog method is synchronous and performs network I/O via _fetch_profile_catalog (which calls vector_store.search_by_metadata). Calling this from an async context like run or search_raw will block the entire event loop, significantly impacting performance and defeating the purpose of a low-latency path. This should be made asynchronous.

Suggested change

-    def _get_profile_catalog(self, user_id: str):
-        cached = self._profile_catalog_cache.get(user_id)
-        now = time.monotonic()
-        if cached and cached[0] > now:
-            return cached[1], cached[2]
-        catalog, results = self._fetch_profile_catalog(user_id)
-        self._profile_catalog_cache[user_id] = (
-            now + _CACHE_TTL_SECONDS,
-            catalog,
-            results,
-        )
-        return catalog, results
+    async def _get_profile_catalog(self, user_id: str):
+        cached = self._profile_catalog_cache.get(user_id)
+        now = time.monotonic()
+        if cached and cached[0] > now:
+            return cached[1], cached[2]
+        # Assuming search_by_metadata is made async or wrapped in an executor
+        catalog, results = await self._fetch_profile_catalog(user_id)
+        self._profile_catalog_cache[user_id] = (
+            now + _CACHE_TTL_SECONDS,
+            catalog,
+            results,
+        )
+        return catalog, results

Resolved in the current head: _get_profile_catalog is now async, and the blocking metadata fetch runs through asyncio.to_thread(self._fetch_profile_catalog, user_id), so the retrieval path no longer blocks the event loop while loading the profile catalog.
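A rough illustration of the `asyncio.to_thread` approach described in this reply follows. The `CatalogLoader` class, the `_fetch_blocking` helper, and the simulated delay are hypothetical; the 60-second TTL matches the figure quoted later in the Greptile summary, but the real constant name and value may differ.

```python
import asyncio
import time
from typing import Dict, List, Tuple

CACHE_TTL_SECONDS = 60.0  # assumed value for this sketch


class CatalogLoader:
    """Caches a per-user catalog and loads it without blocking the event loop."""

    def __init__(self) -> None:
        self._cache: Dict[str, Tuple[float, List[str]]] = {}
        self.fetch_calls = 0  # counter kept only so the sketch can be checked

    def _fetch_blocking(self, user_id: str) -> List[str]:
        # Stand-in for a synchronous vector-store metadata lookup.
        self.fetch_calls += 1
        time.sleep(0.01)
        return [f"profile-record-for-{user_id}"]

    async def get_catalog(self, user_id: str) -> List[str]:
        cached = self._cache.get(user_id)
        now = time.monotonic()
        if cached and cached[0] > now:
            return cached[1]
        # The blocking call runs in a worker thread; other coroutines keep running.
        catalog = await asyncio.to_thread(self._fetch_blocking, user_id)
        self._cache[user_id] = (now + CACHE_TTL_SECONDS, catalog)
        return catalog
```

`asyncio.to_thread` requires Python 3.9 or later, which is compatible with the Python 3.11 checkout mentioned elsewhere in this thread.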

gemini-code-assist · 2026-05-11T18:26:41Z

+        self._profile_catalog_cache: Dict[str, tuple[float, List[Dict[str, str]], List[Any]]] = {}
+        self._retrieval_plan_cache: Dict[tuple[str, str, int, str], tuple[float, AIMessage]] = {}
+        self._latency_samples: Dict[str, List[float]] = {}


The caches _profile_catalog_cache and _retrieval_plan_cache are unbounded dictionaries that only expire entries on access. This can lead to a memory leak as entries for users who do not return will persist indefinitely. Consider using a cache with a maximum size and an eviction policy (e.g., cachetools.TTLCache).

Resolved in the current head: _profile_catalog_cache and _retrieval_plan_cache are now OrderedDict caches with TTL checks plus explicit max-size trimming via _trim_cache. Latency samples are also bounded to avoid unbounded growth.
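The combination of TTL checks and max-size trimming on an `OrderedDict` might look like the sketch below. `BoundedTTLCache` and its injectable `clock` parameter are hypothetical names chosen for illustration; the PR's own helper is `_trim_cache`, whose exact behavior is not shown in this thread.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Optional, Tuple


class BoundedTTLCache:
    """OrderedDict-backed cache with per-entry expiry and a size cap."""

    def __init__(
        self,
        max_entries: int,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: "OrderedDict[Hashable, Tuple[float, Any]]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if expires_at <= self._clock():
            # Expired entries are dropped when they are next looked up.
            del self._entries[key]
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: Hashable, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)
        self._entries.move_to_end(key)
        self._trim()

    def _trim(self) -> None:
        # Evict least recently used entries until the cache fits the cap.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)

    def __len__(self) -> int:
        return len(self._entries)
```

The size cap is what addresses the reviewer's concern: entries for users who never return are eventually evicted by newer entries even though their TTL is only checked on access.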

gemini-code-assist · 2026-05-11T18:26:41Z

        ]

-        ai_response: AIMessage = await self.model_with_tools.ainvoke(messages)
+        plan_key = (user_id, query.strip(), top_k, catalog_text)


Using the entire catalog_text as part of the cache key for _retrieval_plan_cache can be memory-intensive if the catalog is large. Consider using a hash of the catalog_text instead.

import hashlib
catalog_hash = hashlib.sha256(catalog_text.encode()).hexdigest()
plan_key = (user_id, query.strip(), top_k, catalog_hash)

Resolved in the current head: the retrieval plan key now uses hashlib.sha256(catalog_text.encode("utf-8")).hexdigest() instead of storing the full catalog text in the cache key.

strongkeep-debug · 2026-05-11T20:27:37Z

Follow-up after the latest push at that time: 5548b63 included the requested code domain in the raw search path, preserved repository/file/symbol/type/severity metadata, and hardened raw search against missing or non-finite backend scores. It also kept healthy domain results if another requested domain failed.

The earlier performance review items were addressed in the same branch: raw domain searches run concurrently, profile catalog lookup is async, and both caches are bounded with hashed plan keys. Local verification at that point was 50 passed for the full suite, 12 passed for the targeted retrieval/API tests, touched-file Ruff clean, and git diff --check clean.

Update from the same 2026-05-11 head: GitHub Actions Test Suite was green, with Unit, API, and Integration Tests and End-to-End Tests passing on 5548b63. A later test-only commit, a3f6429, added missing-score regression coverage; see the newer comment for its current CI state.

ved015 · 2026-05-13T10:55:55Z

@strongkeep-debug thank you for your contribution. Please review the Gemini suggestions and resolve them, and please also add a comment on each suggestion :)

strongkeep-debug · 2026-05-13T14:03:53Z

@ved015 done. I replied directly on all four Gemini suggestions and verified that branch head covered them at the time: raw domain searches run concurrently with asyncio.gather, profile catalog loading is async via asyncio.to_thread, the profile/plan caches and latency samples are bounded, and the plan cache key uses a SHA-256 catalog hash. PR checks were green on that earlier head; the newer test-only follow-up comment below has the current CI state.

strongkeep-debug · 2026-05-13T15:08:28Z

Small follow-up pushed: a3f6429 adds a regression test proving answer_from_sources() handles sources with score=None without exposing a score marker or re-entering tool selection. This is test-only; runtime behavior is unchanged from the prior hardening commit.

Local validation on the current head:

.venv\Scripts\python -m pytest tests/unit tests/api tests/integration tests/test_deterministic_memory_layer.py tests/test_enterprise_chat.py --cov=src/utils --cov=src/schemas --cov=src/pipelines --cov=src/enterprise --cov=src/api --cov-report=term-missing --cov-report=xml --cov-fail-under=70
50 passed, coverage 79.61% (70% gate)

.venv\Scripts\python -m pytest tests/e2e
1 passed

.venv\Scripts\python -m pytest tests\integration\test_retrieval_pipeline.py tests\api\test_memory_search_routes.py -q
13 passed

git diff --check
clean

The PR label check passed on the new head. The Test Suite workflow is currently marked action_required, so it may need maintainer approval to run on the forked PR; the local commands above mirror the two jobs in .github/workflows/tests.yml with the same test env values.

strongkeep-debug · 2026-05-15T01:36:16Z

Since the hosted Test Suite is still action_required on the fork, I re-verified the current head a3f6429 in a fresh checkout with Python 3.11.

The score-serialization concern is covered on this branch in both places that can emit scores: RetrievalPipeline._score_value() coerces missing, invalid, and non-finite values before ranking/formatting, and the API route uses _safe_score() before serializing raw search sources. The regression suite includes the missing-score answer path plus the raw-search non-finite normalization path.
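The kind of coercion described above can be sketched as a small helper. The name `safe_score` and the fallback of `0.0` are assumptions for illustration; the thread does not show what `_score_value()` or `_safe_score()` actually return for bad input.

```python
import math
from typing import Any


def safe_score(value: Any, default: float = 0.0) -> float:
    """Coerce a backend score to a finite float, falling back to a default."""
    try:
        score = float(value)
    except (TypeError, ValueError, OverflowError):
        # None, non-numeric strings, and oversized integers all land here.
        return default
    # NaN and +/-inf convert cleanly to float but cannot be ranked or
    # serialized as standard JSON numbers.
    return score if math.isfinite(score) else default
```

Because the helper returns its input unchanged for any finite float, applying it twice gives the same result as applying it once.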

Current local validation:

python -m pytest tests\integration\test_retrieval_pipeline.py tests\api\test_memory_search_routes.py -q
13 passed

python -m pytest tests\unit tests\api tests\integration tests\test_deterministic_memory_layer.py tests\test_enterprise_chat.py --cov=src\utils --cov=src\schemas --cov=src\pipelines --cov=src\enterprise --cov=src\api --cov-report=term-missing --cov-report=xml --cov-fail-under=70
50 passed, coverage 79.61% (70% gate)

python -m pytest tests\e2e
1 passed

python -m ruff check src\pipelines\retrieval.py src\api\routes\memory.py tests\integration\test_retrieval_pipeline.py tests\api\test_memory_search_routes.py
All checks passed!

git diff --check
clean

Ankit-Kotnala

Thanks for the PR. One blocker before merge: raw code search seems to query the default retrieval vector store, but code annotations are written to the annotations namespace via code_vector_store.

Can you update _search_code() to read from the same annotations namespace used by ingest/weaver, and add a regression test for that? After that this should be good to merge.

strongkeep-debug · 2026-05-16T16:35:31Z

@Ankit-Kotnala addressed in 51c4150.

Raw code search now reads from the code annotation vector store instead of the default memory store. In the production path, RetrievalPipeline initializes that store with annotations_namespace(org_id), matching the namespace used by ingest/weaver. Tests can also inject a separate code_vector_store, which keeps the namespace boundary explicit.

I added a regression test that seeds a stale domain=code record in the default store and the real annotation record in the code store, then verifies search_raw(..., ["code"]) only returns the annotation-store result.
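The shape of that regression check can be sketched with in-memory stand-ins. `FakeStore` and `MiniPipeline` are hypothetical and far simpler than the real vector stores and `RetrievalPipeline`; the sketch only shows why injecting a separate code store makes the namespace boundary testable.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeStore:
    """In-memory stand-in for a vector store namespace."""

    records: List[Dict[str, str]] = field(default_factory=list)

    def search(self, domain: str) -> List[Dict[str, str]]:
        return [record for record in self.records if record["domain"] == domain]


class MiniPipeline:
    """Keeps the default memory store and the code annotation store separate."""

    def __init__(self, vector_store: FakeStore, code_vector_store: FakeStore) -> None:
        self.vector_store = vector_store
        self.code_vector_store = code_vector_store

    def search_code(self) -> List[Dict[str, str]]:
        # Code hits must come from the annotation store, never the default store.
        return self.code_vector_store.search("code")


# A stale record in the default store should never surface in code search.
default_store = FakeStore([{"domain": "code", "text": "stale default-store record"}])
code_store = FakeStore([{"domain": "code", "text": "real annotation"}])
pipeline = MiniPipeline(default_store, code_store)
```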

Validation run locally:

.\.venv\Scripts\python -m pytest tests\integration\test_retrieval_pipeline.py -q
# 11 passed

.\.venv\Scripts\python -m pytest tests\unit tests\api tests\integration tests\test_deterministic_memory_layer.py tests\test_enterprise_chat.py -q
# 51 passed, 4 existing Pydantic deprecation warnings

git diff --check
# clean

strongkeep-debug · 2026-05-19T01:32:22Z

Updated PR #173 on current main; head is now fa5da96.

This resolves the stale/dirty branch state while keeping the requested code-search fix: production RetrievalPipeline now initializes the raw code search store through the current vector-store factory with annotations_namespace(org_id), and tests can still inject a separate code store to prove code results do not come from the default memory store.

Local validation after the rebase:

uv run --extra dev python -m pytest tests\integration\test_retrieval_pipeline.py tests\api\test_memory_search_routes.py -q
14 passed

uv run --extra dev python -m pytest tests\unit tests\api tests\integration tests\test_deterministic_memory_layer.py tests\test_enterprise_chat.py -q
54 passed

uv run --extra dev python -m pytest tests\e2e -q
1 passed

uv run --extra dev python -m ruff check src\pipelines\retrieval.py src\api\routes\memory.py tests\integration\test_retrieval_pipeline.py tests\api\test_memory_search_routes.py
All checks passed!

git diff --check
clean

Hosted label check is green on the new head. The PR is no longer merge-conflicted; it is waiting on maintainer review after the requested annotation-store change.

ved015 · 2026-05-19T06:55:57Z

@codex please review this PR for bugs, and report actual ones only

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fa5da964be

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

ved015 · 2026-05-19T07:29:43Z

@codex please review this PR and report actual bugs, if any

chatgpt-codex-connector · 2026-05-19T07:34:14Z

Codex Review: Didn't find any major issues. More of your lovely PRs please.


strongkeep-debug · 2026-05-19T13:18:59Z

CI-only follow-up pushed in a4cdba1 after the hosted fork checks exposed workflow-level failures rather than search-code failures.

What changed:

Bandit now installs the SARIF formatter before running -f sarif.
SARIF upload, API-diff PR commenting, and staging deploy are skipped for fork PRs where the token cannot write security events/comments/deployments.
No runtime retrieval/search code changed in this follow-up.

Local validation on the new head:

python yaml parse for security-scan.yml, api-schema-diff.yml, deploy-staging.yml
# ok

uv run --extra dev python -m pytest tests\integration\test_retrieval_pipeline.py tests\api\test_memory_search_routes.py -q
# 14 passed

uv run --extra dev python -m ruff check src\pipelines\retrieval.py src\api\routes\memory.py tests\integration\test_retrieval_pipeline.py tests\api\test_memory_search_routes.py
# All checks passed!

git diff --check
# clean

The new hosted workflow runs are currently action_required, which appears to be GitHub's fork/workflow-change approval gate.

strongkeep-debug · 2026-05-19T14:48:36Z

Update for reviewers: current head 3bf43f7 keeps the PR on runtime/tests only and includes one follow-up from automated review.

What changed since 2a5db42:

Successful memory ingest now invalidates the authenticated user's retrieval profile cache.
Batch ingest invalidates after each successful item.
RetrievalPipeline.invalidate_profile_cache(user_id) clears both that user's profile catalog cache and retrieval-plan cache entries, so a profile update cannot keep serving stale profile records or a plan built from the old catalog.
Added regression coverage at the route level and pipeline-cache level.
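The invalidation described in the list above can be sketched as follows, assuming the plan cache is keyed by a tuple whose first element is the user ID, as in the `(user_id, query, top_k, catalog_hash)` key discussed earlier in this thread. The `RetrievalCaches` class is a hypothetical container, not the actual pipeline.

```python
from collections import OrderedDict
from typing import Any, Tuple

PlanKey = Tuple[str, str, int, str]  # (user_id, query, top_k, catalog_hash)


class RetrievalCaches:
    """Holds the two per-user caches that must be invalidated together."""

    def __init__(self) -> None:
        self.profile_catalog: "OrderedDict[str, Any]" = OrderedDict()
        self.retrieval_plan: "OrderedDict[PlanKey, Any]" = OrderedDict()

    def invalidate_profile_cache(self, user_id: str) -> None:
        # Drop the cached catalog for this user, if there is one.
        self.profile_catalog.pop(user_id, None)
        # Collect matching plan keys first so the dict is not mutated mid-iteration.
        stale = [key for key in self.retrieval_plan if key[0] == user_id]
        for key in stale:
            del self.retrieval_plan[key]
```

Clearing the plan entries as well as the catalog matters because a cached plan was built from the old catalog and could otherwise keep selecting profile records that no longer exist.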

Local validation on the current head:

.\.venv\Scripts\python -m pytest tests\integration\test_retrieval_pipeline.py tests\api\test_memory_search_routes.py -q --tb=short
16 passed

.\.venv\Scripts\python -m ruff check src\api\routes\memory.py src\pipelines\retrieval.py tests\api\test_memory_search_routes.py tests\integration\test_retrieval_pipeline.py
All checks passed!

git diff --check
clean

Hosted label check is green on 3bf43f7. The remaining review state appears to be the prior requested-changes review waiting for maintainer re-review.

strongkeep-debug · 2026-05-19T23:10:55Z

Rebased this PR onto current upstream/main and re-ran the bounty-relevant search tests.

The code-search blocker from the prior review remains addressed in the rebased head: raw code search reads from the annotations-backed code_vector_store namespace, while the raw search route still covers profile/summary/temporal/snippet/code domains.

Validation:

python -m pytest tests/api/test_memory_search_routes.py tests/integration/test_retrieval_pipeline.py -q
16 passed

Could you re-review the code-store blocker when you have a chance?


Ankit-Kotnala · 2026-05-21T17:34:07Z

Re-reviewed the latest head. The code-store issue looks fixed now: raw code search is using the annotations/code vector store instead of the default memory store, and the regression test covers the stale default-store case.

I also checked the raw search flow, answer mode, score normalization, concurrent domain search, and cache invalidation after ingest. The targeted route and retrieval tests pass locally for me.

@ishaanxgupta Looks good from my side.

greptile-apps · 2026-05-23T09:23:09Z

Greptile Summary

This PR introduces a low-latency raw memory search path via RetrievalPipeline.search_raw, wiring it to both /v1/memory/search and a new root /search alias. It also adds optional answer synthesis from raw hits, LRU caches for profile catalogs and retrieval plans, a dedicated code-annotation vector store, and score normalisation across all domain searchers.

search_raw fans out to up to five domain searchers concurrently using asyncio.gather(..., return_exceptions=True), normalises scores, and returns results ranked by similarity without an LLM tool-selection step.
Profile catalog and retrieval-plan results are cached in bounded OrderedDict LRUs with a 60-second TTL; ingest now invalidates both caches for the relevant user.
SearchResponse gains answer, model, confidence, and latency fields; latency currently carries system-wide percentile stats from the shared singleton, leaking cross-user telemetry to every caller.

Confidence Score: 3/5

The new raw search path and caching machinery work correctly in isolation, but the latency snapshot embedded in every search response exposes system-wide request counts and agentic-mode timings to all API callers.

The latency field returned to each caller is populated from the shared pipeline singleton's accumulated samples across all users and all modes. Any user can observe how many recent requests were made in each mode (raw, answer, agentic) and the performance distribution of those requests.

src/api/routes/memory.py (latency snapshot in response) and src/pipelines/retrieval.py (code_vector_store fallback logic and double score normalisation) deserve a second look before merge.

Important Files Changed

Filename	Overview
src/pipelines/retrieval.py	Core pipeline rewrite adding search_raw, answer_from_sources, LRU caches, latency sampling, and a dedicated code annotation store. Contains a redundant double score-normalisation pass in search_raw and a subtle code_vector_store fallback that routes to the wrong store when only vector_store is injected.
src/api/routes/memory.py	Adds search_router for the root /search alias, delegates to pipeline.search_raw, and adds answer synthesis and latency recording. The latency snapshot included in every response exposes cross-user aggregate telemetry and agentic-mode stats to all callers.
src/api/schemas.py	Extends SearchRequest with answer, snippet/code domains, and adds answer/model/confidence/latency to SearchResponse. Missing the strip_query validator that RetrieveRequest has.
src/api/app.py	Registers the new search_router (root /search alias) before memory_router; minor cleanup of unused variable in sentry-debug endpoint.
tests/integration/test_retrieval_pipeline.py	Comprehensive new tests covering caching, raw search, concurrency, score normalisation, and code store isolation.
tests/api/test_memory_search_routes.py	New API-layer tests covering raw hits, answer synthesis, code domain, and cache invalidation on ingest.

Sequence Diagram

sequenceDiagram
    participant C as Client
    participant R as search_router(/search) or router(/v1/memory/search)
    participant P as RetrievalPipeline
    participant VS as VectorStore
    participant CVS as CodeVectorStore
    participant N as Neo4jClient
    participant LLM as LLM

    C->>R: "POST {query, domains, top_k, answer}"
    R->>P: search_raw(query, user_id, domains, top_k)
    par profile in domains
        P->>VS: "search_by_text(filter=profile)"
    and temporal in domains
        P->>N: search_events_by_embedding(query)
    and summary in domains
        P->>VS: "search_by_text(filter=summary)"
    and snippet in domains
        P->>VS: "search_by_text(filter=snippet)"
    and code in domains
        P->>CVS: "search_by_text(filter=code)"
    end
    P-->>R: List[SourceRecord] ranked by score
    alt "answer=true"
        R->>P: answer_from_sources(query, sources)
        P->>LLM: ainvoke(ANSWER_PROMPT + context)
        LLM-->>P: answer text
        P-->>R: str answer
    end
    R->>P: record_latency(mode, elapsed_ms)
    R->>P: get_latency_snapshot()
    P-->>R: "{raw/answer/agentic: {count, p50, p95, p99}}"
    R-->>C: "APIResponse{results, total, answer, model, confidence, latency}"

Reviews (1): Last reviewed commit: "Invalidate profile cache after memory in..."

greptile-apps · 2026-05-23T09:23:12Z

+            answer=answer,
+            model=_model_name(pipeline.model) if req.answer else "",
+            confidence=min(1.0, len(all_results) * 0.2) if answer else 0.0,
+            latency=pipeline.get_latency_snapshot(),


Cross-user latency telemetry in every response

get_latency_snapshot() returns the shared pipeline singleton's accumulated _latency_samples dict, which collects data from all users and all modes (raw, answer, and agentic). Every authenticated caller therefore receives a latency object that includes the count and percentiles of other users' requests — including from the /v1/memory/retrieve agentic endpoint that has nothing to do with search. The count value reveals how many recent requests have been processed system-wide, making this a side-channel that leaks activity patterns across the user base. This data belongs in the existing Prometheus /metrics endpoint, not in a per-call user response.

greptile-apps · 2026-05-23T09:23:13Z

    @field_validator("domains")
    @classmethod
    def validate_domains(cls, v: List[str]) -> List[str]:


Missing strip_query validator on SearchRequest

RetrieveRequest strips leading/trailing whitespace from query via a field_validator, but SearchRequest does not. The agentic pipeline's plan cache already normalises with query.strip() on its own, so a search for "latency " and "latency" would hit the same cache entry but be stored as different raw keys. Adding the validator keeps the two request models consistent.

Suggested change

-    @field_validator("domains")
-    @classmethod
-    def validate_domains(cls, v: List[str]) -> List[str]:
+    @field_validator("query")
+    @classmethod
+    def strip_query(cls, v: str) -> str:
+        return v.strip()
+
+    @field_validator("domains")
+    @classmethod
+    def validate_domains(cls, v: List[str]) -> List[str]:

greptile-apps · 2026-05-23T09:23:15Z

+        if code_vector_store is not None:
+            self.code_vector_store = code_vector_store
+        elif vector_store is None:
+            self.code_vector_store = get_vector_store(
+                namespace=annotations_namespace(org_id),
+                create_if_not_exists=False,
+            )
+        else:
+            self.code_vector_store = self.vector_store


code_vector_store falls back to the main store when a custom vector_store is passed

The condition elif vector_store is None means: when a caller supplies an explicit vector_store (but no code_vector_store), code annotations are read from that same general-purpose store rather than the org-scoped annotations namespace. In production this path is likely never hit (both args are None), but any integration test that passes only vector_store= will silently route code-domain queries to the wrong namespace and miss real annotation data. Consider changing the condition to elif code_vector_store is None or at least documenting the fallback intent clearly.
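To make the three branches concrete, the selection logic from the diff above can be restated as a standalone function. `resolve_code_store` and the `make_annotations_store` factory are hypothetical names; the branch order mirrors the quoted constructor code.

```python
from typing import Any, Callable, Optional


def resolve_code_store(
    vector_store: Optional[Any],
    code_vector_store: Optional[Any],
    make_annotations_store: Callable[[], Any],
) -> Any:
    """Mirror of the constructor branches quoted in the diff above."""
    if code_vector_store is not None:
        # Case 1: an explicitly injected code store always wins.
        return code_vector_store
    if vector_store is None:
        # Case 2: nothing injected, so build the org-scoped annotations store.
        return make_annotations_store()
    # Case 3: only vector_store injected, so code search shares that store.
    # This is the fallback the review comment flags.
    return vector_store


# Sentinel objects standing in for real stores.
MAIN_STORE = object()
CODE_STORE = object()
ANNOTATIONS_STORE = object()
```

Case 3 is the one the comment warns about: a caller that injects only `vector_store` gets code-domain queries answered from that general-purpose store.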

greptile-apps · 2026-05-23T09:23:16Z

+        results = [
+            record
+            for domain_results in task_results
+            if not self._log_search_error(domain_results)
+            for record in domain_results
+        ]
+        for record in results:
+            record.score = self._score_value(record.score)
+
+        return sorted(results, key=lambda record: record.score, reverse=True)


Double application of _score_value in search_raw

Every per-domain search helper (_search_profile_raw, _search_temporal, _search_summary, _search_snippet, _search_code) already calls self._score_value(r.score) when constructing each SourceRecord. The loop at the end of search_raw then calls _score_value a second time on every record. While idempotent for well-formed floats, it is redundant work and could confuse readers into thinking the inner helpers do not normalise scores. The second pass can be removed.

strongkeep-debug requested review from ishaanxgupta and ved015 as code owners May 11, 2026 18:21

github-actions Bot added tests api pipelines labels May 11, 2026

gemini-code-assist Bot reviewed May 11, 2026


michael-schvarcz mentioned this pull request May 11, 2026

Add raw search fast path with code domain support #174

Open

strongkeep-debug mentioned this pull request May 12, 2026

Add low-latency raw search path separate from agentic answer synthesis #163

Open

ishaanxgupta requested a review from Ankit-Kotnala May 16, 2026 12:54

Ankit-Kotnala requested changes May 16, 2026


strongkeep-debug force-pushed the codex/163-search-fast-path branch from 51c4150 to fa5da96 Compare May 19, 2026 01:31

chatgpt-codex-connector Bot reviewed May 19, 2026


Comment thread src/pipelines/retrieval.py

strongkeep-debug had a problem deploying to staging May 19, 2026 09:36 — with GitHub Actions Failure

strongkeep-debug force-pushed the codex/163-search-fast-path branch from 3bf43f7 to ec8d3ed Compare May 19, 2026 23:10

strongkeep-debug force-pushed the codex/163-search-fast-path branch 3 times, most recently from d3f97f6 to 46affc9 Compare May 21, 2026 11:33

strongkeep-debug added 10 commits May 21, 2026 06:31

Add low-latency memory search path

b0ee54d

Tighten low-latency retrieval path

47c31fe

Include code annotations in raw search

d07e782

Harden raw search score handling

e4fb402

Add raw answer missing-score regression

5c79aec

Search code annotations from code store

4cfcca3

ci: harden fork PR checks

124cfae

Revert "ci: harden fork PR checks"

a141611

This reverts commit a4cdba1.

Fix Sentry probe lint

5b04cc7

Invalidate profile cache after memory ingest

9aaf5f8

strongkeep-debug force-pushed the codex/163-search-fast-path branch from 46affc9 to 9aaf5f8 Compare May 21, 2026 13:35

greptile-apps Bot reviewed May 23, 2026
